Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend nullable pointer optimization to library types #36

Closed
wants to merge 3 commits into from
Closed

Extend nullable pointer optimization to library types #36

wants to merge 3 commits into from

Conversation

emberian
Copy link
Member

@emberian emberian commented Apr 6, 2014

No description provided.

@thestinger
Copy link

In my opinion, a field should be marked with #[not_null] instead of the type as a whole. This is required for this to work for types like Vec<T>.

@lilyball
Copy link
Contributor

lilyball commented Apr 6, 2014

I like this, although I don't believe it can work with Vec<T>. Vec { ptr: 0, ... } is a valid vector of zero capacity, and in fact is what Vec::new() returns today.

@emberian
Copy link
Member Author

emberian commented Apr 6, 2014

@kballard that's just the example @thestinger gave me

@lilyball
Copy link
Contributor

lilyball commented Apr 6, 2014

@cmr Sure. Which is why I still approve of this RFC overall. I just wanted to make it clear that it did not, in fact, apply to Vec<T>.

@huonw
Copy link
Member

huonw commented Apr 6, 2014

In my opinion, a field should be marked with #[not_null] instead of the type as a whole

We could have "built-in" NotNull<T> { x: *mut T } which takes the attribute, similar to how we have Unsafe<T>. Although I guess we'd want ImmutNotNull<T> { x: *T } too, which is a little unfortunate.

Then something like a library Uniq would be

struct Uniq<T> {
    data: NotNull<T>
}

@emberian
Copy link
Member Author

emberian commented Apr 6, 2014

@huonw note that the rfc has been updated making that obsolete. I'm against adding more lang items for this when the attribute works well, and only affects the (undefined) enum representation.

@huonw
Copy link
Member

huonw commented Apr 6, 2014

It doesn't have to be a proper #[lang] item, it can just be a #[unsafe_not_null] attribute on the NotNull type. "Real" types have the advantage of being easily documentable and searchable, while attributes are much harder (just a list in the manual or something).

(I was just mentioning it as a possibility, I don't have any skin in this game. ;P )

@lilyball
Copy link
Contributor

lilyball commented Apr 6, 2014

@huonw If we had NotNull<T> I would expect it to look like

struct NotNull<T> {
    #[unsafe_not_null]
    data: T
}

so your Uniq<T> would have to have a field NotNull<*T>.

That said, I don't think it's necessary. The NotNull aspect here is to enable an optimization, not for any semantics or correctness reasons.

@alexcrichton
Copy link
Member

This is quite an unsafe attribute, so while I'm glad that the word unsafe is in the attribute, it doesn't deter anyone from using it really (see unsafe_destructor). I would very much rather this require the use of unsafe blocks rather than just attributes. (such as with @huonw's suggestion of some NotNull type or some related solution).

Additionally, this is semi-incompatible with unsafe_no_drop_flag because the pointer is indeed null when the destructor is run for the second time. Depending on what we tell LLVM or how code is written, it could be even more unsafe than originally anticipated (just something to keep in mind).

@emberian
Copy link
Member Author

emberian commented Apr 7, 2014

It already require unsafe blocks to use the raw pointers. I see no evil
with this. After drop has run I view the contents of the value entirely
100% undefined, and so I also see no evil with this. I don't foresee
telling LLVM anything about this, I expect to implement this as just

On Sun, Apr 6, 2014 at 7:03 PM, Alex Crichton notifications@gh.neting.ccwrote:

This is quite an unsafe attribute, so while I'm glad that the word unsafeis in the attribute, it doesn't deter anyone from using it really (see
unsafe_destructor). I would very much rather this require the use of
unsafe blocks rather than just attributes. (such as with @huonwhttps://github.com/huonw's
suggestion of some NotNull type or some related solution).

Additionally, this is incompatible with unsafe_no_drop_flag because the
pointer is indeed null when the destructor is run for the second time.
Depending on what we tell LLVM or how code is written, it could be even
more unsafe than originally anticipated (just something to keep in mind).


Reply to this email directly or view it on GitHubhttps://github.com//pull/36#issuecomment-39686254
.

http://octayn.net/

@emberian
Copy link
Member Author

emberian commented Apr 7, 2014

... an extension of Nullable in trans::adt.

On Sun, Apr 6, 2014 at 10:47 PM, Corey Richardson corey@octayn.net wrote:

It already require unsafe blocks to use the raw pointers. I see no evil
with this. After drop has run I view the contents of the value entirely
100% undefined, and so I also see no evil with this. I don't foresee
telling LLVM anything about this, I expect to implement this as just

On Sun, Apr 6, 2014 at 7:03 PM, Alex Crichton notifications@gh.neting.ccwrote:

This is quite an unsafe attribute, so while I'm glad that the word unsafeis in the attribute, it doesn't deter anyone from using it really (see
unsafe_destructor). I would very much rather this require the use of
unsafe blocks rather than just attributes. (such as with @huonwhttps://github.com/huonw's
suggestion of some NotNull type or some related solution).

Additionally, this is incompatible with unsafe_no_drop_flag because the
pointer is indeed null when the destructor is run for the second time.
Depending on what we tell LLVM or how code is written, it could be even
more unsafe than originally anticipated (just something to keep in mind).


Reply to this email directly or view it on GitHubhttps://github.com//pull/36#issuecomment-39686254
.

http://octayn.net/

http://octayn.net/

@bill-myers
Copy link

How about changing raw pointers so that they are not allowed to be null?

Then, *T will represent a non-null pointer, and Option<*T> will represent a nullable raw pointer, just like in the rest of the language.

If this is too verbose, one can perhaps rename Option to Opt, or can add a Ptr<T> alias for Option<*T>.

@lilyball
Copy link
Contributor

Why? What benefit will this bring? Raw pointers are not guaranteed to be valid, so preventing them from being NULL doesn't really add any safety. If you want a valid, non-null pointer, use &T. And if you want to disable borrowck, use transmute() to change the lifetime to 'static.

Non-nullable raw pointers only serves to make them even harder to use.

@bill-myers
Copy link

Well, it brings the benefit that they behave like borrowed pointers, that you don't need a concept of "null" in the language in addition to None, and you can use or not use Option instead of adding this new unsafe_not_null attribute.

&T is not just non-null, it's also guaranteed to not alias &mut X and &U for incompatible U (at least in theory/in future) so you can't "just use &T".

@huonw
Copy link
Member

huonw commented Apr 10, 2014

There is an issue in the main rust repo that discussed (and rejected) *T being non-null. (I'm on my phone so I can't find & link it.)

@emberian
Copy link
Member Author

rust-lang/rust#10571

On Wed, Apr 9, 2014 at 8:32 PM, Huon Wilson notifications@gh.neting.ccwrote:

There is an issue in the main rust repo that discussed (and rejected) *Tbeing non-null. (I'm on my phone so I can't find & link it.)


Reply to this email directly or view it on GitHubhttps://github.com//pull/36#issuecomment-40032080
.

http://octayn.net/

@bill-myers
Copy link

Well, this RFC shows there is a need for something like that.

A more conservative option is to add a new type, let's call it ValidPtr<T>, which behaves exactly like *T, except that it must be a valid pointer (i.e. non-null and pointing to an usable part of the address space).

This is equivalent to this RFC, except it is done cleanly in the type system instead of with an ad-hoc attribute (which means semantics are clear, e.g. the question of who checks whether the value is actually null).

Then, one can address as a separate question whether you should then still have *T or whether Option<ValidPtr<T>> and uint can replace it (with less verbose naming, of course).

BTW, I just realized that talking about "null" vs "not null" is probably a mistake, and we should instead think about "valid now" (points to memory valid now) vs "not valid now" (points to usable address space, but there is no allocation there) vs "never valid" (points to unusable parts of the address space, i.e. space reserved for the kernel or > 2^48 on x86-64) vs "tagged as absent" (null pointer in most ABIs).

This allows the ABI on embedded systems with no MMU that start physical memory as 0 to declare that an "absent" pointer is not represented as 0, but rather as 0xffffffff (which means Option optimization would represent None as 0xffffffff), and have the language continue working normally.

@dobkeratops
Copy link

isnt' the point of unsafe blocks and *T for catching everything the type system can't express. adding more safety or semantics to unsafe concepts seems contradictory.

Wouldn't it be better to leave unsafe {} T as close to C as possible -accept no type system in existance can do *everything, there's always the possibility of tricks or special cases on machines you haven't seen yet.. - but focus on adding more ways to avoid needing unsafe in the first place. (C's power is that it can behave like a super-assembler, and rust's idea of essentially replicating C's ability inside unsafe blocks is great)

eg, - enums have eliminated the need for casting, how about a way of expressing enums that allocate the used space only, can't switch to larger variants..

  • Acess to typesafe vtables and recomposing trait-objects, so you can implement that 'Many' mutli-interface object safely..
  • ways of expressing blob objects, precompiled immutable object graphs in a single allocation.. (i could possible see ways of doing that with '.' being enhanced to carry information down an acessor graph..)
    .. And so on..

@nikomatsakis
Copy link
Contributor

At most recent meeting, we decided to close this RFC. This seems like a fine design but there are important and similar cases (i.e., i31) that it doesn't handle. Since it doesn't feel like this problem must be solved urgently, we'd rather hold off from doing anything until we have time to mull a comprehensive solution. That said, if no comprehensive solution seems to work out, this seems like a good outline for a simple solution.

@nikomatsakis
Copy link
Contributor

Another option would be to have a struct like

#[lang=...]
struct NotNull<T> { pub pointer: *mut T }

and then Rc can use NotNull<T> in place of *T.

This is a lang item because it will be recognized by ADT.

@nikomatsakis
Copy link
Contributor

The reason to use a field pointer is so that users can do (*foo.pointer) rather than foo.deref(), as the latter generates a lot of throw-away code (per the original RFC).

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017
@petrochenkov petrochenkov removed the postponed RFCs that have been postponed and may be revisited at a later time. label Feb 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants